Chemoproteomic capture of RNA binding activity in living cells

Proteomic methods for RNA interactome capture (RIC) rely principally on crosslinking native or labeled cellular RNA to enrich and investigate RNA-binding protein (RBP) composition and function in cells. The ability to measure RBP activity at individual binding sites by RIC, however, has been more challenging due to the heterogeneous nature of peptide adducts derived from the RNA-protein crosslinked site. Here, we present an orthogonal strategy that utilizes clickable electrophilic purines to directly quantify protein-RNA interactions on proteins through photoaffinity competition with 4-thiouridine (4SU)-labeled RNA in cells. Our photo-activatable-competition and chemoproteomic enrichment (PACCE) method facilitated detection of >5500 cysteine sites across ~3000 proteins displaying RNA-sensitive alterations in probe binding. Importantly, PACCE enabled functional profiling of canonical RNA-binding domains as well as discovery of moonlighting RNA binding activity in the human proteome. Collectively, we present a chemoproteomic platform for global quantification of protein-RNA binding activity in living cells.


SUPPLEMENTARY FIGURES
Supplementary Figure 1. Reaction mechanism of purine-based probe molecules. (A) The proposed nucleophilic aromatic substitution (SNAr) reaction of CEPs with nucleophilic groups. (B) HPLC assay for measuring solution reactivity of CEPs. Time-dependent reactions were performed between nucleophiles (10.8 mM) and CEPs (AHL-Pu-1 or AHL-Pu-2; 9.8 mM). Tetramethylguanidine (TMG) was included as a base to facilitate covalent reaction. The following nucleophiles were chosen to mimic amino acid side chain groups: butanethiol (cysteine), n-butylamine (lysine), p-cresol (tyrosine), butyric acid (aspartate/glutamate), and propionamide (asparagine/glutamine). (C) Representative example of HPLC analysis of AHL-Pu-1 reaction to form the butanethiol-CEP adduct. Covalent reaction at C6 to form the AHL-Pu-1-butanethiol adduct was confirmed by retention times that matched those of the synthetic standard (Pa-1, D). See Supplementary Methods for additional details. Data shown are representative of three independent experiments (n=3).

(A) Comparison of individual CEP reactivity against nucleophiles that mimic side chain functional groups of the indicated amino acid. See Supplementary Figure 1 for the set of nucleophiles tested. (B) Comparison of AHL-Pu-1 vs AHL-Pu-2 reactivity against nucleophiles in solution. HPLC assay was performed as depicted in Supplementary Figure 1 and described in Supplementary Methods. Data shown are representative of three independent experiments (n=3). (C) Stability of AHL-Pu-1 and AHL-Pu-2 in phosphate-buffered saline (PBS) buffer. Solutions of CEPs (9.8 mM) were prepared and HPLC analysis of these probes was performed at the indicated time points. Negligible degradation, as determined by reduction of CEP signal, was observed after 48 hrs (2 days). See Supplementary Methods for additional details of the stability assay. Data shown are representative of three independent experiments (n=3).

Supplementary Figure 7.
CEP competition against purine and pyrimidine bases. The protein-binding profiles of AHL-Pu-1-labeled HEK293T proteomes are competed by pretreatment with purines but not pyrimidines in vitro as determined by gel (A) and quantitative chemical proteomics (B). These studies provide evidence that AHL-Pu-1 covalent binding activity is dependent on purine recognition. Quantitative chemical proteomics showed that CEP-modified proteins (C) and sites (D) are largely competed with purine (25 mM) and adenine (2.5 mM) but not uracil (2.5 mM) or cytosine (2.5 mM). Proteomes were co-treated with nitrogenous bases at the indicated concentrations for 30 min at 37 °C and AHL-Pu-1 (25 µM, 30 min, 37 °C). Data shown are representative of n=3 biologically independent experiments. See Supplementary Figure 5 and Supplementary Methods for additional details.

Supplementary Figure 9.
Identifying optimal conditions for 4SU metabolic labeling of cellular RNA in HEK293T cells. Optimizing non-toxic conditions for metabolic incorporation of 4SU into cellular RNA using RNA dot blots (A, B) and agarose gel analyses following published methods 4. (C) UV irradiation at 312 nm to crosslink 4SU-labeled cellular RNA to proteins was confirmed by agarose gel-shift assays. Photocrosslinking of native RNA to proteins was also observed at this wavelength. (D) Photocrosslinking using optimized 4SU conditions (100 µM, 16 hr) did not result in overt toxicity to cells. Data shown are representative of n=3 biologically independent experiments. Details on statistical tests used can be found in Supplementary Methods.

Supplementary Figure 15. Identification of known RBPs using PACCE. (A) Location and distance of RS-Cys sites to RNA in RBP-RNA structures: DDX17, C298 (6UV2, X-ray); SF3B6, C83 (7Q4O, X-ray). Cys residues reside in canonical RBDs, including RRM and Helicase-ATP binding domains. DDX17 probe-modified peptide is shared with DDX5. Distances were calculated using PyMOL. Details on statistical analysis can be found in Supplementary Methods. Data shown are mean + SEM for n=3 biologically independent replicates for each treatment condition. *p ≤ 0.05. (B) Mass Spectrometry analysis of the RNA-sensitive, AHL-Pu-1-modified tryptic peptide on DDX17/DDX5 (C298/C221). Covalent addition of AHL-Pu-1 onto the Cys residue results in a modified Cys (C*) with a mass addition of +604.2631 Da. In addition to standard b- and y-fragment ions, internal fragment ions due to fragmentation of the AHL-Pu-1 probe are highlighted in teal or pink, respectively. Internal fragmentation of y- and b-fragment ions that contain a portion of the probe are denoted by iX and fX annotations, respectively. Yellow peaks denote desthiobiotin fragments. Data shown are representative of 3 biologically independent replicates. The average SR value across biological replicates can be found in Supplementary Data 4. (C) Modified sequence and MS2 fragment ion annotation of the SF3B6 C83 site.

Supplementary Figure 16. Mass Spectrometry analysis of the RS-Cys site on DDX19B (C393). Modified sequence and MS2 fragment ion annotation of VLVTTNVC*AR peptide from DDX19B. This Cys site is also found in DDX19A (C392). See Supplementary Figure 15 for details on the annotation and interpretation of MS2 data.

Iodoacetamide alkyne was synthesized and characterized according to previously published literature 10.
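As a rough illustration of how the +604.2631 Da mass addition reported for AHL-Pu-1-modified cysteine translates into a precursor mass, the sketch below computes the monoisotopic mass of the DDX19B peptide VLVTTNVC*AR. The residue, water, and proton masses are standard monoisotopic values rather than numbers from this document, and the sketch assumes the adduct sits on an otherwise unmodified cysteine and ignores any SILAC heavy labels.

```python
# Standard monoisotopic residue masses (Da) for the residues in this peptide.
RESIDUE_MASS = {
    "A": 71.03711, "C": 103.00919, "L": 113.08406, "N": 114.04293,
    "R": 156.10111, "T": 101.04768, "V": 99.06841,
}
WATER = 18.010565
PROTON = 1.007276
PROBE_ADDUCT = 604.2631  # mass addition on Cys stated in the figure caption


def modified_peptide_mass(sequence):
    """Neutral monoisotopic mass; '*' after a residue adds the probe adduct."""
    mass = WATER
    for ch in sequence:
        mass += PROBE_ADDUCT if ch == "*" else RESIDUE_MASS[ch]
    return mass


def mz(neutral_mass, charge):
    """m/z of the protonated ion at the given positive charge state."""
    return (neutral_mass + charge * PROTON) / charge


neutral = modified_peptide_mass("VLVTTNVC*AR")
print(f"neutral mass: {neutral:.4f} Da")
for z in (2, 3):
    print(f"charge {z}+ m/z: {mz(neutral, z):.4f}")
```

The same helper can be reused for other probe-modified peptides by extending the residue table.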

General Protocol for synthesis of purine compounds
The purine base (21.7 mmol, 1.0 eq), dimethylformamide (DMF, 100 mL), potassium carbonate (K2CO3, 21.7 mmol, 1.0 eq) and propargyl bromide (80 wt% in toluene, 21.7 mmol, 1.0 eq) were mixed in a round bottom flask. The reaction was stirred under nitrogen at room temperature for 12 hrs. The reaction was treated with water (400 mL) and extracted with ethyl acetate (3 x 100 mL).
The combined organic layer was dried over sodium sulfate and concentrated to a tan solid. This solid was dissolved in chloroform (100 mL). The solution was concentrated and heated to reflux to dissolve all the solids. Upon cooling, a white crystalline solid formed, which was isolated by filtration. The solid was rinsed with fresh chloroform (20 mL) and heptane (20 mL) to give the N-9 substituted product after air drying. The filtrate contained a mixture of the N-7 and N-9 products.
These were separated using the Biotage flash chromatography system (5% acetone to 20% acetone/chloroform) to afford the respective products.

[...] concentrated to give crude product that was purified on the Biotage flash chromatography system (20% ethyl acetate to 80% hexanes) to give 650 mg of product as an off-white solid.
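For the general alkylation protocol, the reagent amounts implied by the 21.7 mmol scale can be worked out as in the sketch below. The molar masses are standard values and are not taken from this document. The purine base itself is left out because its identity, and therefore its molar mass, is not specified here.

```python
MMOL = 21.7  # reaction scale from the protocol; each reagent is 1.0 eq

# Standard molar masses in g/mol (assumed values, not from the source text).
MW_K2CO3 = 138.205
MW_PROPARGYL_BROMIDE = 118.96
PROPARGYL_WT_FRACTION = 0.80  # supplied as an 80 wt% solution in toluene


def grams_needed(mmol, molar_mass, equivalents=1.0, wt_fraction=1.0):
    """Grams of reagent (or reagent solution) that deliver mmol * equivalents."""
    return mmol * equivalents * molar_mass / 1000.0 / wt_fraction


k2co3_g = grams_needed(MMOL, MW_K2CO3)
propargyl_solution_g = grams_needed(
    MMOL, MW_PROPARGYL_BROMIDE, wt_fraction=PROPARGYL_WT_FRACTION
)

print(f"K2CO3: {k2co3_g:.2f} g")
print(f"Propargyl bromide solution (80 wt%): {propargyl_solution_g:.2f} g")
```

Dividing by the weight fraction converts the mass of pure propargyl bromide into the mass of toluene solution that has to be weighed out.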

Purity of CEP probes
CEP probes were prepared in ACN (10 mM final). The compound stock (50 μL) was then mixed with 10 μL of ACN. This sample mixture was injected (1 μL) and analyzed by reverse-phase HPLC on a Shimadzu 1100 Series spectrometer with UV detection at 254 nm. Chromatographic separation was performed using a Phenomenex Kinetex C18 column (2.6 μm, 50 x 4.6 mm).
The purity of AHL-Pu-1 (A) and AHL-Pu-2 (B) was determined to be ≥95% by using HPLC method A.
(A) AHL-Pu-1 PBS stability chromatograms. Compound integrity after 48 hrs was determined to be >99% by HPLC method A. (B) AHL-Pu-2 PBS stability chromatograms. Compound integrity after 48 hrs was determined to be >92% by HPLC method A.

HPLC analysis of compound reactivity
HPLC Method B: Probes were dissolved in 500 µL DMF-ACN solution and stirred on ice with TMG and the amino acid mimetics. At the indicated time point, a 50 µL aliquot was removed and quenched in a solution of acetic acid and caffeine. Solutions were analyzed by HPLC and consumption of CEP probe was quantified as described in the Methods section. The HPLC gradient from Method A was used in these assays.
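One plausible way to turn such time-course chromatograms into probe consumption is sketched below. It assumes that the caffeine in the quench solution serves as an internal standard and that consumption is reported relative to the first time point. Neither assumption is confirmed in this section, and the peak areas are invented for illustration.

```python
def percent_consumed(probe_areas, standard_areas):
    """Percent of probe consumed at each time point, relative to the first.

    Each probe peak area is normalized to the internal-standard peak area
    from the same injection before comparing against the baseline.
    """
    ratios = [p / s for p, s in zip(probe_areas, standard_areas)]
    baseline = ratios[0]
    return [100.0 * (1.0 - r / baseline) for r in ratios]


# Hypothetical peak areas (arbitrary units) at successive time points.
probe = [1000.0, 750.0, 400.0, 90.0]
caffeine = [500.0, 500.0, 480.0, 450.0]

consumed = percent_consumed(probe, caffeine)
for i, value in enumerate(consumed):
    print(f"time point {i}: {value:.1f}% consumed")
```

Normalizing to the internal standard compensates for injection-to-injection variation in the amount of sample loaded.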
*Crystallography was performed on AHL-Pu-2 and the unit cell determination showed an exact match to a previously published compound that is consistent with the expected product 12 (CCDC 638951).

Supplementary Figure 3.
Live cell labeling using CEPs. (A) Gel-based ABPP analysis of DM93 cells treated with CEP probes using optimized treatment conditions for LC-MS/MS quantitative chemical proteomics. DM93 cells were treated with 25 µM of AHL-Pu-1 or AHL-Pu-2 for 4 hr at 37 °C. After treatment, cells were lysed, and probe-modified soluble (left panel; 2 mg/mL) and membrane proteomes (right panel; 2 mg/mL) were subjected to CuAAC with rhodamine-azide followed by SDS-PAGE analysis and in-gel fluorescence scanning. (B) Concentration-dependent labeling of DM93 cells treated with CEP probes. DM93 cells were treated with the indicated concentrations of AHL-Pu-1 or AHL-Pu-2 for 4 hr at 37 °C followed by gel-based ABPP analyses. (C) Time-dependent labeling of DM93 cells treated with CEP probes (25 µM of AHL-Pu-1 or AHL-Pu-2 for the indicated times at 37 °C) and subjected to gel-based ABPP analyses. (D) Cell viability of DM93 cells treated with AHL-Pu-1 (25 µM, 4 hr, 37 °C) as determined by the WST-1 assay for cell proliferation and viability. Cell viability was not statistically significantly different between DMSO vehicle and AHL-Pu-1 treated cells (p = 0.4). Statistical significance was determined using a Mann-Whitney test. Data shown are representative of n=3 biologically independent experiments.
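To make the Mann-Whitney comparison concrete, the sketch below implements an exact two-sided test by enumerating every way of splitting the pooled values into two groups, which is feasible for n=3 per group. The viability values are invented, and it is an assumption that an exact test (rather than a normal approximation) was used.

```python
from itertools import combinations


def u_statistic(x, y):
    """Number of (x, y) pairs with x > y; ties count as one half."""
    return sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)


def exact_mann_whitney(x, y):
    """Return (U, exact two-sided p-value) by full enumeration of group labels."""
    pooled = list(x) + list(y)
    observed = u_statistic(x, y)
    null_us = []
    for idx in combinations(range(len(pooled)), len(x)):
        chosen = set(idx)
        gx = [pooled[i] for i in idx]
        gy = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        null_us.append(u_statistic(gx, gy))
    lower = sum(u <= observed for u in null_us) / len(null_us)
    upper = sum(u >= observed for u in null_us) / len(null_us)
    return observed, min(1.0, 2 * min(lower, upper))


# Hypothetical viability readings (% of control), three replicates per group.
dmso = [100, 96, 103]
probe_treated = [97, 99, 95]

u, p = exact_mann_whitney(dmso, probe_treated)
print(f"U = {u}, exact two-sided p = {p:.2f}")
```

With three replicates per group there are only 20 possible label assignments, so the exact p-value can take only a handful of discrete values.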

Supplementary Figure 4.
Experimental conditions used for PACCE in HEK293T cells do not generally alter the transcriptome or proteome. (A) Normalized expression values (fragments per kilobase of transcript per million mapped reads [FPKM]) as determined by paired-end RNA-Seq. Respective samples were treated with either 4SU (100 µM, 16 hr), CEP (25 µM, 1 hr), or both at 37 °C. Cells were processed using a PureLink RNA Mini Kit. (B) Principal component (PC) analysis results derived from RNA-Seq samples. The X and Y axes indicate PC1 and PC2, which explain 14% and 19% of the total variation, respectively. The correlation scatter plot of SR values (log2) of different cell treatments compared to DMSO (C) or PACCE condition (CEP+4SU; D) to assess proteomic alterations. Peaks defined by log2(L/H ratios) >5 were set to 5. Data shown are representative of n=3 biologically independent experiments.
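The capping rule stated in the caption, where log2(L/H) values above 5 are set to 5, can be sketched as follows. How a missing heavy signal is handled is an assumption made here for illustration and is not described in the caption.

```python
import math

LOG2_CAP = 5.0  # log2(L/H) values above this are set to the cap, per the caption


def capped_log2_ratio(light, heavy, cap=LOG2_CAP):
    """log2 of the light/heavy SILAC ratio, truncated at the upper cap."""
    if light <= 0:
        raise ValueError("light intensity must be positive")
    if heavy <= 0:
        # Assumption: with no heavy signal the ratio is treated as saturated.
        return cap
    return min(math.log2(light / heavy), cap)


# Hypothetical (light, heavy) intensity pairs.
for light, heavy in [(8.0, 1.0), (1.0, 4.0), (1000.0, 1.0), (5.0, 0.0)]:
    print(light, heavy, capped_log2_ratio(light, heavy))
```

Only an upper cap is applied because the caption mentions none for strongly negative values.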

Supplementary Figure 6.
). Schematic corresponds to Figure 1B-D. (D) Competitive workflow to evaluate inhibitor activity in proteomes. Cell proteomes derived from either light- or heavy-labeled cells were co-treated with DMSO vehicle or nitrogenous base (0.025-25 mM, 30 min, 37 °C), respectively, and CEP probe (25 µM). Non-competed sites are expected to show equivalent probe labeling intensity in vehicle (L)- and fragment (H)-treated conditions (SR~1). Nitrogenous base-competed sites are identified by probe-modified peptides showing a substantial reduction in peptide abundance (due to competition of CEP labeling) in nitrogenous base-treated compared with vehicle-treated control samples (SR >>1). This workflow was used for LC-MS/MS studies shown in Supplementary Figure 7. Additional details of the chemoproteomic assays can be found in Supplementary Methods.

Amino acid preference of CEPs in live cell environment. Distribution of CEP modifications on nucleophilic amino acid residues from probe-modified peptides of detected proteins (soluble proteomes) from quantitative chemoproteomic analyses of CEP-treated DM93 cells. Data shown are high confidence sites (Byonic score > 600) for AHL-Pu-1 (A) and AHL-Pu-2 (B) and representative of n=3 biologically independent experiments.
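The amino acid preference analysis amounts to filtering probe-modified sites by score and tallying the modified residue, as in the minimal sketch below. The input records are invented, and the tuple layout is a hypothetical stand-in for whatever the search engine actually exports.

```python
from collections import Counter

SCORE_CUTOFF = 600  # high-confidence threshold (Byonic score > 600) from the caption


def residue_distribution(sites, cutoff=SCORE_CUTOFF):
    """Percent of high-confidence modified sites per amino acid residue.

    `sites` is an iterable of (residue, score) pairs; only scores strictly
    above the cutoff are kept.
    """
    counts = Counter(residue for residue, score in sites if score > cutoff)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {residue: 100.0 * n / total for residue, n in counts.items()}


# Hypothetical (modified residue, score) records.
example_sites = [("C", 812), ("C", 655), ("K", 601), ("Y", 590), ("C", 600), ("C", 930)]

distribution = residue_distribution(example_sites)
for residue, pct in sorted(distribution.items()):
    print(f"{residue}: {pct:.0f}%")
```

Sites scoring exactly 600 are excluded because the caption specifies a strict inequality.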
Supplementary Figure 8. Structure-activity relationship (SAR) of CEP analogs. DM93 cells were treated with AHL-Pu-1 (N7 alkyne handle) or AHL-Pu-2 (N9 alkyne handle) at 25 µM for 4 hr (37 °C). (A) Venn diagram of overlapping sites between CEP analogs. Functional protein classes enriched in CEP-treated DM93 cells as determined by Gene Ontology

Supplementary Figure 10.
PACCE conditions for quantifying RNA-sensitive cysteine (RS-Cys) sites. (A) UV crosslinking of RNA to proteins protects cysteines located in or proximal to RNA-binding sites from CEP labeling. PACCE captures RS-Cys sites competed (SR >2) by crosslinking (i) 4SU-RNA and (ii) 4SU- and native-RNA. RS-Cys sites (5,530, B) and proteins (3,018, C) were determined by the aggregate of all sites that show sensitivity to RNA crosslinking competition from PACCE in situ (live cell labeling using CEP probe). (D) Average SR value for peptides found in respective groups. The number of reported sites per group is highlighted. Data shown are representative of n=3 independent experiments from PACCE studies in HEK and DM93 cells.

Supplementary Figure 11.
Comparison of PACCE using CEP (in vitro and in situ) and IA-alkyne (in vitro) probe in HEK293T cells and proteomes. (A) Venn diagram of overlapping sites between IA-alkyne and CEP. (B) Domain enrichment analysis of CEP- and IA-alkyne modified sites identified from HEK soluble and membrane fractions. Domain enrichment analyses were performed as previously described 5. Data shown are representative of n=3 biologically independent experiments.
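The SR >2 criterion for calling RS-Cys sites can be illustrated with the sketch below. The site names and ratios are invented, and averaging the SILAC ratio across replicates before applying the threshold is an assumption about how replicates are combined.

```python
from statistics import mean

SR_THRESHOLD = 2.0  # sites with SR > 2 are treated as RNA-sensitive


def rna_sensitive_sites(replicate_ratios, threshold=SR_THRESHOLD):
    """Map each site whose mean SR exceeds the threshold to that mean SR."""
    averaged = {site: mean(values) for site, values in replicate_ratios.items()}
    return {site: sr for site, sr in averaged.items() if sr > threshold}


# Hypothetical SILAC ratios (light/heavy) from three replicates per site.
ratios = {
    "SITE_A": [3.1, 2.8, 3.4],  # strongly competed by RNA crosslinking
    "SITE_B": [2.4, 2.2, 2.6],  # moderately competed
    "SITE_C": [1.0, 1.1, 0.9],  # not competed (SR close to 1)
}

rs_cys = rna_sensitive_sites(ratios)
for site, sr in rs_cys.items():
    print(f"{site}: mean SR = {sr:.2f}")
```

Sites with SR near 1 show equal probe labeling with and without crosslinked RNA and are therefore not reported.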

Supplementary Figure 12.
RNase treatment reduces RS-Cys detection by PACCE in HEK293T cell proteomes. (A) Workflow for evaluating effects of RNase treatment on RS-Cys detection by PACCE. SILAC light and heavy cells are cultured in the absence or presence of 4SU-RNA, respectively. Cells were lysed and proteomes exposed to UV irradiation followed by CEP probe labeling and quantitative chemical proteomics to identify RS-Cys sites (SR >2). The role of crosslinked 4SU-RNA in protecting RNA-binding sites from CEP labeling was verified by addition of RNase to proteomes prior to UV irradiation. (B) Quantitation of RNase treatments in PACCE studies. RNase treatment of lysates resulted in a 60% reduction in the number of RS-Cys sites detected [a total of 934 and 317 RS-Cys sites in (-)RNase and (+)RNase sample groups, respectively]. Data shown are mean + SEM for n = 3 biologically independent experiments.

Supplementary Figure 13.
RS-Cys-containing proteins compared with RBPs detected by RIC methods and the curated human RBP dataset 3,6-9. Inset depicts coverage of Human Annotated RBPs by CEP and RIC methods. PACCE detects an additional 346 RBPs not captured by the 5 existing RIC methods.

Supplementary Figure 14.
Comparison of PACCE versus other proteomic methods for RBP analysis. (A) Descriptions of mass spectrometry based RBP identification methods. (B) Table of key advantages and disadvantages for each respective method compared.
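The comparison of PACCE hits with RIC datasets is essentially a set operation. The sketch below shows one way to count annotated RBPs found by PACCE but by none of the RIC methods. The protein identifiers are placeholders, and the exact definition used to arrive at the reported count is an assumption.

```python
def novel_rbps(pacce_proteins, annotated_rbps, ric_datasets):
    """Annotated RBPs detected by PACCE but absent from every RIC dataset."""
    ric_union = set().union(*ric_datasets)
    return (set(pacce_proteins) & set(annotated_rbps)) - ric_union


# Placeholder identifiers, not real results.
pacce = {"P1", "P2", "P3", "P4", "P5"}
annotated = {"P1", "P2", "P3", "P6"}
ric = [{"P1"}, {"P1", "P6"}, {"P7"}]

unique_to_pacce = novel_rbps(pacce, annotated, ric)
print(sorted(unique_to_pacce))
```

Taking the union over all RIC datasets first means a protein counts as previously captured if any single RIC method detected it.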

Supplementary Figure 17.
Mass Spectrometry analysis of the RS-Cys site on EXOC4 (C957). (A) MS2 fragment ion annotation of LKEIIC*EQAAIK peptide from EXOC4. Inset contains a schematic of fragments from probe and internal fragmentation. (B) EXOC4 domains. The RS-Cys site is highlighted with a red arrow. See Supplementary Figure 15 for details on the annotation and interpretation of MS2 data.
Mass Spectrometry analysis of the RS-Cys sites on UIMC1. (A) UIMC1 domains and sequence alignments showing conservation of RS-Cys sites. (B) Modified sequence and MS2 fragment ion annotation of AIAESLNSC*RPSDASATR peptide from UIMC1 C121. (C) Modified sequence and MS2 fragment ion annotation of SFVSISEATDC*LVDFKK peptide from UIMC1 C691. See Supplementary Figure 15 for details on the annotation and interpretation of MS2 data.
Supplementary Figure 20. Analysis of RBP binding capacity using deletion mutants. Diagram of PTBP1 (A) and EXOC4 (B) wild-type (WT) protein and domains/regions deleted in corresponding mutants. Red lines for EXOC4 represent a 20 amino acid deletion surrounding the RS-Cys C957 site. (C) Representative western blot comparing RNP formation of wild-type (WT) and corresponding RBP mutants in HEK293T cells used for quantitation. Data shown are representative of n=6 independent experiments.

: p-Cresol, n-Butylamine Acros: 1,1,3,3-Tetramethylguanidine, 99% (TMG), Butyric acid, Propargyl bromide (